3 research outputs found
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
In this work, we present a multimodal solution to the problem of 4D face
reconstruction from monocular videos. 3D face reconstruction from 2D images is
an under-constrained problem due to the ambiguity of depth. State-of-the-art
methods try to solve this problem by leveraging visual information from a
single image or video, whereas 3D mesh animation approaches rely more on audio.
However, in most cases (e.g. AR/VR applications), videos include both visual
and speech information. We propose AVFace, which incorporates both modalities and
accurately reconstructs the 4D facial and lip motion of any speaker, without
requiring any 3D ground truth for training. A coarse stage estimates the
per-frame parameters of a 3D morphable model, followed by a lip refinement, and
then a fine stage recovers facial geometric details. Due to the temporal audio
and video information captured by transformer-based modules, our method is
robust in cases when either modality is insufficient (e.g. face occlusions).
Extensive qualitative and quantitative evaluation demonstrates the superiority
of our method over the current state of the art.
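As a rough illustration of the coarse stage described above, the sketch below evaluates a generic linear 3D morphable model once per frame: vertex positions are a mean shape plus weighted identity and expression offsets. The abstract does not say which morphable model AVFace uses or how its parameters are split, so the function name `morphable_model_vertices`, the toy bases, the coefficient values, and the choice to share identity across frames are all hypothetical.

```python
from typing import List, Sequence


def morphable_model_vertices(
    mean_shape: Sequence[float],
    identity_basis: Sequence[Sequence[float]],
    expression_basis: Sequence[Sequence[float]],
    identity_coeffs: Sequence[float],
    expression_coeffs: Sequence[float],
) -> List[float]:
    """Evaluate mean + sum_i a_i * B_id[i] + sum_j e_j * B_exp[j].

    All shapes are flattened vertex coordinates [x0, y0, z0, x1, ...].
    """
    if len(identity_basis) != len(identity_coeffs):
        raise ValueError("need one identity coefficient per identity basis vector")
    if len(expression_basis) != len(expression_coeffs):
        raise ValueError("need one expression coefficient per expression basis vector")

    vertices = list(mean_shape)
    weighted = list(zip(identity_coeffs, identity_basis))
    weighted += list(zip(expression_coeffs, expression_basis))
    for coeff, direction in weighted:
        if len(direction) != len(vertices):
            raise ValueError("basis vectors must match the mean shape length")
        for k, delta in enumerate(direction):
            vertices[k] += coeff * delta
    return vertices


# Toy data (hypothetical): two vertices, one identity and one expression direction.
MEAN_SHAPE = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
IDENTITY_BASIS = [[0.0, 0.0, 0.0, 0.5, 0.0, 0.0]]
EXPRESSION_BASIS = [[0.0, -1.0, 0.0, 0.0, -1.0, 0.0]]

# Assumption for illustration: identity is fixed for the video,
# while the expression coefficients change from frame to frame.
identity = [2.0]
expressions_per_frame = [[0.0], [0.25]]
frames = [
    morphable_model_vertices(
        MEAN_SHAPE, IDENTITY_BASIS, EXPRESSION_BASIS, identity, expression
    )
    for expression in expressions_per_frame
]
```

In a real system the per-frame coefficients would be predicted by a network from the audio and video features rather than written by hand; here they are fixed only to show how the parameters map to geometry.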
SIDER: Single-Image Neural Optimization for Facial Geometric Detail Recovery
We present SIDER (Single-Image neural optimization for facial geometric DEtail Recovery), a novel photometric optimization method that recovers detailed facial geometry from a single image in an unsupervised manner. Inspired by classical techniques of coarse-to-fine optimization and recent advances in implicit neural representations of 3D shape, SIDER combines a geometry prior based on statistical models and Signed Distance Functions (SDFs) to recover facial details from single images. First, it estimates a coarse geometry using a morphable model represented as an SDF. Next, it reconstructs facial geometry details by optimizing a photometric loss with respect to the ground truth image. In contrast to prior work, SIDER does not rely on any dataset priors and does not require additional supervision from multiple views, lighting changes or ground truth 3D shape. Extensive qualitative and quantitative evaluation demonstrates that our method achieves state-of-the-art results on facial geometric detail recovery, using only a single in-the-wild image.
Peer reviewed. Postprint (author's final draft).
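The coarse-to-fine idea can be sketched with a toy signed distance function: a fixed coarse SDF plus a small detail offset, and a photometric loss that compares rendered and observed pixel intensities. This is a minimal sketch under stated assumptions, not SIDER's implementation. The sphere standing in for the morphable-model prior, the sinusoidal detail term, and the names `coarse_sdf`, `detailed_sdf` and `photometric_loss` are hypothetical, and the abstract does not specify the form of the loss (a mean absolute difference is assumed here).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def coarse_sdf(p: Point, radius: float = 1.0) -> float:
    """Signed distance to a sphere centred at the origin.

    Negative inside, zero on the surface, positive outside.
    The sphere is a stand-in for a coarse shape prior.
    """
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def detailed_sdf(p: Point, amplitude: float = 0.0, frequency: float = 8.0) -> float:
    """Coarse SDF plus a small high-frequency offset.

    The sinusoid is an assumed placeholder for an optimized detail term.
    After adding it, the result is only approximately a distance field.
    """
    return coarse_sdf(p) + amplitude * math.sin(frequency * p[0])


def photometric_loss(rendered: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute difference between rendered and observed intensities."""
    if len(rendered) != len(observed) or not rendered:
        raise ValueError("images must be non-empty and the same size")
    return sum(abs(r - o) for r, o in zip(rendered, observed)) / len(rendered)


# With zero amplitude the detailed surface coincides with the coarse one.
on_surface = detailed_sdf((1.0, 0.0, 0.0), amplitude=0.0)

# A perfectly rendered image gives zero loss; any mismatch raises it.
perfect = photometric_loss([0.5, 0.5], [0.5, 0.5])
mismatch = photometric_loss([0.0, 1.0], [1.0, 1.0])
```

An optimizer would adjust the detail parameters (here only `amplitude` and `frequency`) to lower the photometric loss against the single input image, which requires a differentiable renderer that this sketch leaves out.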
SIDER: Single-image neural optimization for facial geometric detail recovery
Paper presented at the International Conference on Computer Vision (ICCV), held virtually from 11 to 17 October 2021.
In this work we present SIDER, a method for high-fidelity detailed 3D face reconstruction from a single image that can be trained in an unsupervised manner. Our approach combines the best of classical statistical models and recent implicit neural representations. The former is used to obtain a coarse shape prior, and the latter provides high-frequency geometric detail, by optimizing only a photometric loss computed with respect to the input image. A thorough quantitative and qualitative evaluation shows that SIDER outperforms the current state of the art by a significant margin. A limitation of our current approach is that it still cannot handle details like hair or beards and accessories such as glasses. This is because the photometric loss for these regions would require sub-pixel accuracy. In the future, we will explore alternatives for addressing this type of high-frequency detail.
This work is partly supported by the Spanish government with the project MoHuCo PID2020-120049RB-I00
and María de Maeztu Seal of Excellence MDM-2016-0656.
This work was also supported by a gift from Adobe, Partner University Fund 4DVision Project, and the SUNY2020
Infrastructure Transportation Security Center.